Unrolled neural networks have recently achieved state-of-the-art accelerated MRI reconstruction. These networks unroll iterative optimization algorithms by alternating between physics-based consistency and neural-network-based regularization. However, they require several iterations of a large neural network to handle high-dimensional imaging tasks such as 3D MRI. This limits traditional training algorithms based on backpropagation, due to the large memory and compute requirements of calculating gradients and storing intermediate activations. To address this challenge, we propose Greedy LEarning for Accelerated MRI (GLEAM) reconstruction, an efficient training strategy for high-dimensional imaging settings. GLEAM splits the end-to-end network into decoupled network modules. Each module is optimized in a greedy manner with decoupled gradient updates, reducing the memory footprint during training. We further show that the decoupled gradient updates can be executed in parallel on multiple graphics processing units (GPUs) to additionally reduce training time. We present experiments on 2D and 3D datasets, including multi-coil knee, brain, and dynamic cardiac cine MRI. We observe that: i) GLEAM generalizes as well as state-of-the-art memory-efficient baselines with the same memory footprint, such as gradient checkpointing and invertible networks, but trains 1.3x faster; ii) for the same memory footprint, GLEAM yields a 1.1 dB PSNR gain in 2D and 1.8 dB in 3D over end-to-end baselines.
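To make the training scheme concrete, here is a minimal PyTorch sketch of greedy module-wise optimization with decoupled gradient updates, assuming a simple unrolled architecture; the module design, local loss, and all names are illustrative, not the authors' released code.

```python
import torch
import torch.nn as nn

class UnrollModule(nn.Module):
    """One unroll step: a small regularizer network (placeholder for
    regularization plus a physics-based data-consistency step)."""
    def __init__(self, channels=2):
        super().__init__()
        self.regularizer = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1),
        )

    def forward(self, x):
        return x + self.regularizer(x)

modules = [UnrollModule() for _ in range(4)]
optimizers = [torch.optim.Adam(m.parameters(), lr=1e-4) for m in modules]
loss_fn = nn.MSELoss()

def gleam_step(x, target):
    for module, opt in zip(modules, optimizers):
        x = module(x)
        loss = loss_fn(x, target)   # local loss for this module only
        opt.zero_grad()
        loss.backward()             # gradient confined to one module's graph
        opt.step()
        x = x.detach()              # decouple: free activations immediately
    return x
```

Because each `backward()` sees only one module's computation graph and `detach()` severs the chain between modules, the peak memory is that of a single module rather than the full unrolled network, and per-module updates can in principle run on separate GPUs.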
There are multiple scales of abstraction from which we can describe the same image, depending on whether we are focusing on fine-grained details or a more global attribute of the image. In brain mapping, learning to automatically parse images to build representations of both small-scale features (e.g., the presence of cells or blood vessels) and global properties of an image (e.g., which brain region the image comes from) is a crucial and open challenge. However, most existing datasets and benchmarks for neuroanatomy consider only a single downstream task at a time. To bridge this gap, we introduce a new dataset, annotations, and multiple downstream tasks that provide diverse ways to read out information about brain structure and architecture from the same image. Our multi-task neuroimaging benchmark (MTNeuro) is built on volumetric, micrometer-resolution X-ray microtomography images spanning a large thalamocortical section of mouse brain, encompassing multiple cortical and subcortical regions. We generated a number of different prediction challenges and evaluated several supervised and self-supervised models for brain-region prediction and pixel-level semantic segmentation of microstructures. Our experiments not only highlight the rich heterogeneity of this dataset, but also provide insights into how self-supervised approaches can be used to learn representations that capture multiple attributes of a single image and perform well on a variety of downstream tasks. Datasets, code, and pre-trained baseline models are provided at: https://mtneuro.github.io/ .
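As an illustration of reading out multiple attributes from one image, here is a hypothetical PyTorch sketch (not the released MTNeuro code) of a frozen, self-supervised encoder feeding two task heads: image-level brain-region classification and pixel-level microstructure segmentation. The backbone, head sizes, and shapes are assumptions.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(            # stand-in for a pretrained backbone
    nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
)
for p in encoder.parameters():
    p.requires_grad = False         # freeze; only the readout heads train

region_head = nn.Sequential(        # global readout: which brain region?
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(128, 4))
seg_head = nn.Conv2d(128, 3, 1)     # local readout: per-pixel class

x = torch.randn(8, 1, 128, 128)     # a batch of microtomography slices
features = encoder(x)               # one representation, two readouts
region_logits = region_head(features)   # shape (8, 4)
pixel_logits = seg_head(features)       # shape (8, 3, 128, 128)
```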
Artificial Intelligence (AI) has become commonplace for solving routine everyday tasks. Because of the exponential growth in medical imaging data volume and complexity, the workload on radiologists is steadily increasing. We project that the gap between the number of imaging exams and the number of expert radiologist readers required to cover this increase will continue to expand, consequently introducing a demand for AI-based tools that improve the efficiency with which radiologists can comfortably interpret these exams. AI has been shown to improve efficiency in medical-image generation, processing, and interpretation, and a variety of such AI models have been developed across research labs worldwide. However, very few of these, if any, find their way into routine clinical use, a discrepancy that reflects the divide between AI research and successful AI translation. To address the barrier to clinical deployment, we have formed the MONAI Consortium, an open-source community that is building standards for AI deployment in healthcare institutions and developing tools and infrastructure to facilitate their implementation. This report represents several years of weekly discussions and hands-on problem-solving experience by groups of industry experts and clinicians in the MONAI Consortium. We identify barriers between AI-model development in research labs and subsequent clinical deployment, and propose solutions. Our report provides guidance on processes that take an imaging AI model from development to clinical implementation in a healthcare institution. We discuss various AI integration points in a clinical Radiology workflow. We also present a taxonomy of Radiology AI use-cases. Through this report, we intend to educate the stakeholders in healthcare and AI (AI researchers, radiologists, imaging informaticists, and regulators) about cross-disciplinary challenges and possible solutions.
Recent work in large language modeling (LLMs) has used fine-tuning to align outputs with the preferences of a prototypical user. This work assumes that human preferences are static and homogeneous across individuals, so that aligning to a single "generic" user will confer more general alignment. Here, we embrace the heterogeneity of human preferences to consider a different challenge: how might a machine help people with diverse views find agreement? We fine-tune a 70 billion parameter LLM to generate statements that maximize the expected approval for a group of people with potentially diverse opinions. Human participants provide written opinions on thousands of questions touching on moral and political issues (e.g., "should we raise taxes on the rich?"), and rate the LLM's generated candidate consensus statements for agreement and quality. A reward model is then trained to predict individual preferences, enabling it to quantify and rank consensus statements in terms of their appeal to the overall group, defined according to different aggregation (social welfare) functions. The model produces consensus statements that are preferred by human users over those from prompted LLMs (>70%) and significantly outperforms a tight fine-tuned baseline that lacks the final ranking step. Further, our best model's consensus statements are preferred over the best human-generated opinions (>65%). We find that when we silently constructed consensus statements from only a subset of group members, those who were excluded were more likely to dissent, revealing the sensitivity of the consensus to individual contributions. These results highlight the potential to use LLMs to help groups of humans align their values with one another.
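A hedged sketch of the final ranking step described above: a reward model predicts each member's approval of each candidate statement, and a social welfare function aggregates those predictions into one score per candidate. The `reward_model` callable, its signature, and the toy scorer are illustrative assumptions, not the paper's API.

```python
import torch

def rank_consensus(candidates, members, reward_model, welfare="min"):
    # scores[i][j]: predicted approval of member j for candidate i
    scores = torch.tensor([
        [reward_model(statement, member) for member in members]
        for statement in candidates
    ])
    if welfare == "min":   # egalitarian: favor the least-satisfied member
        agg = scores.min(dim=1).values
    else:                  # utilitarian: mean approval across the group
        agg = scores.mean(dim=1)
    best = int(agg.argmax())
    return candidates[best], agg

# Usage with a toy reward model (word-overlap stand-in):
toy_rm = lambda s, m: float(len(set(s.split()) & set(m.split())))
statement, _ = rank_consensus(
    ["raise taxes moderately on high incomes", "keep all taxes unchanged"],
    ["raise taxes on the rich", "keep taxes low for everyone"],
    toy_rm)
```

The choice of welfare function matters: a `min` aggregation optimizes for the member most likely to dissent, which connects to the paper's finding that excluding members from construction makes them more likely to reject the consensus.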
Previous work has shown that a neural network with the rectified linear unit (ReLU) activation function leads to a convex polyhedral decomposition of the input space. These decompositions can be represented by a dual graph with vertices corresponding to polyhedra and edges corresponding to polyhedra sharing a facet, which is a subgraph of a Hamming graph. This paper illustrates how one can utilize the dual graph to detect and analyze adversarial attacks in the context of digital images. When an image passes through a network containing ReLU nodes, the firing or non-firing at a node can be encoded as a bit ($1$ for ReLU activation, $0$ for ReLU non-activation). The sequence of all bit activations identifies the image with a bit vector, which identifies it with a polyhedron in the decomposition and, in turn, identifies it with a vertex in the dual graph. We identify ReLU bits that are discriminators between non-adversarial and adversarial images and examine how well collections of these discriminators can ensemble vote to build an adversarial image detector. Specifically, we examine the similarities and differences of ReLU bit vectors for adversarial images, and their non-adversarial counterparts, using a pre-trained ResNet-50 architecture. While this paper focuses on adversarial digital images, ResNet-50 architecture, and the ReLU activation function, our methods extend to other network architectures, activation functions, and types of datasets.
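A minimal sketch of extracting the ReLU bit vector that identifies an image with a polyhedron (and hence a vertex of the dual graph): register a forward hook on every ReLU in a torchvision ResNet-50 and record one bit per unit. The preprocessing and input are stand-ins; this assumes torchvision >= 0.13 for the `weights` argument.

```python
import torch
import torch.nn as nn
import torchvision

model = torchvision.models.resnet50(weights="DEFAULT").eval()  # pretrained
bits = []

def record_bits(module, inputs, output):
    # A unit "fires" (bit 1) when its post-activation value is positive.
    bits.append((output > 0).flatten(1).to(torch.uint8))

hooks = [m.register_forward_hook(record_bits)
         for m in model.modules() if isinstance(m, nn.ReLU)]

image = torch.randn(1, 3, 224, 224)   # stand-in for a preprocessed image
with torch.no_grad():
    model(image)
bit_vector = torch.cat(bits, dim=1)   # one row per image in the batch
for h in hooks:
    h.remove()

# The Hamming distance between two images' bit vectors counts the ReLU
# units that discriminate between them, i.e., edges in the Hamming graph.
```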
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
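The released checkpoints are hosted on the Hugging Face Hub under the `bigscience` organization and load with the standard `transformers` API; a short sketch using the small 560M-parameter variant (the full 176B model requires multi-GPU sharding, e.g., `device_map="auto"`):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")

inputs = tokenizer("BLOOM is an open-access language model that",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```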
Artificial Intelligence (AI) is one of the most transformative technologies of the 21st century. The extent and scope of future AI capabilities remain a key uncertainty, with widespread disagreement on timelines and potential impacts. As nations and technology companies race toward greater complexity and autonomy in AI systems, there are concerns over the extent of integration and oversight of opaque AI decision processes. This is especially true in the subfield of machine learning (ML), where systems learn to optimize objectives without human assistance. Objectives can be imperfectly specified or executed in an unexpected or potentially harmful way. This becomes more concerning as systems increase in power and autonomy, where an abrupt capability jump could result in unexpected shifts in power dynamics or even catastrophic failures. This study presents a hierarchical complex systems framework to model AI risk and provide a template for alternative futures analysis. Survey data were collected from domain experts in the public and private sectors to classify AI impact and likelihood. The results show increased uncertainty over the powerful AI agent scenario, confidence in multiagent environments, and increased concern over AI alignment failures and influence-seeking behavior.
Neural networks are vulnerable to adversarial attacks: adding well-crafted, imperceptible perturbations to their input can modify their output. Adversarial training is one of the most effective approaches to training models robust against such attacks. Unfortunately, this method is much slower than vanilla training of neural networks, since it needs to construct adversarial examples for the entire training data at every iteration. By leveraging the theory of coreset selection, we show how selecting a small subset of training data provides a principled approach to reducing the time complexity of robust training. To this end, we first provide convergence guarantees for adversarial coreset selection. In particular, we show that the convergence bound is directly related to how well our coresets approximate the gradient computed over the entire training data. Motivated by our theoretical analysis, we propose using this gradient approximation error as our adversarial coreset selection objective to effectively reduce the training set size. Once built, we run adversarial training over this subset of the training data. Unlike existing methods, our approach can be adapted to a wide variety of training objectives, including TRADES, $\ell_p$-PGD, and Perceptual Adversarial Training. We conduct extensive experiments to demonstrate that our approach speeds up adversarial training by 2-3 times while experiencing only a slight degradation in clean and robust accuracy.
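A simplified toy sketch of the selection idea: pick a weighted subset of examples whose gradients best approximate the full-data gradient, then run adversarial training only on that subset. The greedy matching-pursuit rule below (repeatedly add the example that best aligns with the residual) is a stand-in for the selection machinery used in practice, and the per-example gradients would typically be cheap last-layer proxies.

```python
import torch

def select_coreset(per_example_grads, k):
    # per_example_grads: (n, d) tensor of per-example gradient proxies.
    full = per_example_grads.mean(dim=0)        # full-data gradient
    chosen, residual = [], full.clone()
    for _ in range(k):
        scores = per_example_grads @ residual   # alignment with residual
        for i in chosen:                        # never re-pick an example
            scores[i] = -float("inf")
        j = int(scores.argmax())
        chosen.append(j)
        # remove the chosen gradient's component from the residual
        g = per_example_grads[j]
        residual = residual - (residual @ g) / (g @ g + 1e-12) * g
    return chosen

# Usage: train adversarially on the returned indices only, refreshing
# the coreset every few epochs as the model (and its gradients) change.
coreset = select_coreset(torch.randn(1000, 64), k=100)
```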
While recent studies have focused on quantifying word usage to find the overall shapes of narrative emotional arcs, certain features of narratives within narratives remain to be explored. Here, we characterize the narrative time scale of sub-narratives by finding the length of text at which fluctuations in word usage begin to be relevant. We represent more than 30,000 Project Gutenberg books as time series using ousiometrics, a power-danger framework for essential meaning that is itself a reinterpretation of the valence-arousal-dominance framework derived from semantic differentials. We decompose each book's power and danger time series using empirical mode decomposition into a sum of constituent oscillatory modes and a non-oscillatory trend. By comparing the decomposition of the original power and danger time series with those derived from shuffled text, we find that shorter books exhibit only a general trend, while longer books have fluctuations in addition to the general trend, much as subplots have arcs within an overall narrative arc. These fluctuations typically have a period of a few thousand words regardless of book length or library classification code, but vary depending on the content and structure of the book. Our method provides a data-driven denoising approach that works for texts of various lengths, in contrast to more traditional approaches that use large window sizes and may inadvertently smooth out relevant information, especially for shorter texts.
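A hedged sketch of the decomposition step. The PyEMD library (`pip install EMD-signal`) is our library choice, not necessarily the authors'; the lexicon and the smoothed word-time series construction are hypothetical stand-ins for the ousiometric danger scores.

```python
import numpy as np
from PyEMD import EMD

def book_series(words, danger_score, window=500):
    # Smoothed "danger" time series over word-time.
    scores = np.array([danger_score.get(w, 0.0) for w in words])
    kernel = np.ones(window) / window
    return np.convolve(scores, kernel, mode="valid")

rng = np.random.default_rng(0)
words = rng.choice(["war", "home", "storm", "garden"], size=20_000).tolist()
lexicon = {"war": 0.9, "storm": 0.6, "home": -0.4, "garden": -0.5}

signal = book_series(words, lexicon)
imfs = EMD().emd(signal)        # rows: oscillatory modes; last row ~ trend

shuffled = book_series(rng.permutation(words).tolist(), lexicon)
imfs_null = EMD().emd(shuffled) # null decomposition from shuffled text

# Modes whose amplitude exceeds their shuffled-text counterpart mark
# genuine sub-narrative fluctuations; in the paper these typically have
# periods of a few thousand words.
```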
Analyzing the uncertainty of machine learning models is essential for many applications. While research on uncertainty quantification (UQ) techniques is very advanced for computer vision applications, UQ methods for spatio-temporal data are less studied. In this paper, we focus on models for online handwriting recognition, one specific type of spatio-temporal data. The data is observed from a sensor-enhanced pen, with the goal of classifying written characters. We conduct a broad evaluation of aleatoric (data) and epistemic (model) UQ based on two prominent techniques for Bayesian inference, Stochastic Weight Averaging-Gaussian (SWAG) and Deep Ensembles. Next to providing a better understanding of the model, UQ techniques can detect out-of-distribution data and domain shifts that arise when combining right-handed and left-handed writers (an underrepresented group).
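A minimal sketch of the Deep Ensembles side of such an evaluation: train several independently initialized classifiers, average their softmax outputs, and use predictive entropy as an uncertainty score, so that high entropy flags likely out-of-distribution inputs (e.g., left-handed writing for a model trained mostly on right-handed writers). The architecture, feature dimension, and class count are illustrative.

```python
import torch
import torch.nn as nn

def make_model(num_classes=26, in_features=64):
    return nn.Sequential(nn.Linear(in_features, 128), nn.ReLU(),
                         nn.Linear(128, num_classes))

ensemble = [make_model() for _ in range(5)]  # 5 independent initializations

@torch.no_grad()
def predict_with_uncertainty(x):
    # Average the ensemble's predictive distributions, then score
    # uncertainty as the entropy of the averaged distribution.
    probs = torch.stack([m(x).softmax(dim=-1) for m in ensemble]).mean(0)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
    return probs, entropy

features = torch.randn(16, 64)  # stand-in for pen sensor features
probs, uncertainty = predict_with_uncertainty(features)
```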